data science resource
The Logjam in AI/ML Platforms is About to Complicate Your Life
We are at an inflection point where too many vendors are offering too many solutions for moving our AI/ML models to production. The very real risk is duplication of effort, fragmentation of our data science resources, and unintended new technical debt as we bind ourselves to platforms that have hidden assumptions or limitations in how they approach problems. Remember when our biggest problem was getting our models off of data science platforms and into production? Well, the market is nothing if not efficient, and hundreds of platform companies have been laboring away to help solve your pain point. The problem arising for the CDO, CAO or any other CXX is deciding which, and how many, of these you need.
40 Python Statistics For Data Science Resources
For an introduction to statistics, this tutorial with real-life examples is the way to go. The notebooks of this tutorial will introduce you to concepts like mean, median, standard deviation, and the basics of topics such as hypothesis testing and probability distributions. A fine way to start your stats learning, since it is inspired by the books "Think Bayes" and "Think Stats", which are two top recommendations that will come back below! If you're looking for books, you can try out this free book on computational statistics in Python, which not only contains an introduction to programming with Python, but also treats topics such as Markov Chain Monte Carlo, the Expectation-Maximization (EM) algorithm, resampling methods, and much more. Or you can buy this book by Thomas Haslwanter for a general introduction to common statistical tests, linear regression analysis and topics from survival analysis and Bayesian statistics. Note that this book does take life and medical sciences as an application area. Both of the above books already introduce you to more advanced statistics topics with Python too, as you can see. If you're a fan of videos, you should consider watching this tutorial on statistical data analysis with SciPy with Christopher Fonnesbeck, an Assistant Professor in the Department of Biostatistics at the Vanderbilt University School of Medicine.
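As a small taste of the concepts mentioned above (mean, median, standard deviation, and a first step toward hypothesis testing), here is a minimal sketch that uses only Python's standard library. It is not taken from any of the listed tutorials or books; the function names `describe` and `one_sample_t` and the sample values are made up for illustration.

```python
import math
import statistics


def describe(sample):
    """Return the mean, median, and sample standard deviation of a sample."""
    return {
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        # statistics.stdev uses the n - 1 (sample) denominator.
        "stdev": statistics.stdev(sample),
    }


def one_sample_t(sample, mu0):
    """Return the t statistic for the null hypothesis that the population mean is mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / standard_error


# Hypothetical measurements, chosen only to keep the arithmetic easy to follow.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

summary = describe(sample)
t_stat = one_sample_t(sample, mu0=5)

print(summary)
print(t_stat)
```

Turning the t statistic into a p-value requires the t distribution, which is where a library such as SciPy, covered in the resources above, comes in.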
65 Free Data Science Resources for Beginners
To become a data scientist, you have a formidable challenge ahead. You'll need to master a variety of skills, ranging from machine learning to business analytics. However, the rewards are worth it. Organizations will prize alchemists who can turn raw data into smarter decisions, better products, happier customers, and ultimately more profit. Plus, you'll get to solve interesting problems and master new, impactful technologies.
Top 10 Data Science Resources on Github
In our latest inspection of Github repositories, we focus on "data science" projects. Unlike other searches we have performed over the past several months, nearly all of the repositories which show up (listed by number of stars in descending order) are resources for learning data science, as opposed to tools for doing it. As such, this is much less a software listing than it is a collection of tutorials and educational resources. There are, however, a few software surprises in here as well, such as a data science-oriented IDE and a great notebook-related project. We still include the standard informational notification we have placed on our previous Github Top 10 lists: open source tools have been used by 73% of data scientists in the past 12 months, according to a recent KDnuggets survey (and accounting for the 12 months prior to the survey). While the following repositories focus mainly on learning resources, previous offerings have been software-heavy; also, open source learning materials are the new black, and a main source of learning for data scientists these days.